    Improving Sampling from Generative Autoencoders with Markov Chains

    We focus on generative autoencoders, such as variational or adversarial autoencoders, which jointly learn a generative model alongside an inference model. We define generative autoencoders as autoencoders that are trained to softly enforce a prior on the latent distribution learned by the model. However, the model does not necessarily learn to match the prior. We formulate a Markov chain Monte Carlo (MCMC) sampling process, equivalent to iteratively encoding and decoding, which allows us to sample from the learned latent distribution. Using this process, we can improve the quality of samples drawn from the model, especially when the learned distribution is far from the prior. Using MCMC sampling, we also reveal previously unseen differences between generative autoencoders trained either with or without the denoising criterion.
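
    The sampling process above has a direct implementation: draw an initial latent from the prior, then repeatedly decode and re-encode. Below is a minimal PyTorch sketch, assuming trained `encoder` and `decoder` callables (hypothetical names; the paper's exact interfaces are not given in the abstract).

```python
import torch

def mcmc_sample(encoder, decoder, n_steps=10, n_samples=64, latent_dim=32):
    # Initialise the chain from the N(0, I) prior over latents.
    z = torch.randn(n_samples, latent_dim)
    with torch.no_grad():
        for _ in range(n_steps):
            x = decoder(z)   # map current latents to data space
            z = encoder(x)   # re-encode; the chain drifts toward the learned latent distribution
        return decoder(z)    # samples after the chain has (approximately) mixed
```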

    Plasma Concentrations of Tranexamic Acid in Postpartum Women After Oral Administration.

    OBJECTIVE: To evaluate the pharmacokinetics of tranexamic acid after oral administration to postpartum women. METHODS: We conducted a single-center pharmacokinetic study at Teaching Hospital-Jaffna, Sri Lanka, on 12 healthy postpartum women who delivered vaginally. After oral administration of 2 g of immediate-release tranexamic acid 1 hour after delivery, pharmacokinetic parameters were measured on plasma samples at 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 5, 6, 8, 10, and 12 hours. Plasma tranexamic acid concentrations were determined by high-performance liquid chromatography. The outcome measures were maximum observed plasma concentration, time to maximum plasma concentration, time to reach the effective plasma concentration, duration for which the effective plasma concentration lasted, area under the concentration-time curve, and half-life of tranexamic acid. RESULTS: The mean maximum observed plasma concentration was 10.06 micrograms/mL (range 8.56-12.22 micrograms/mL). The mean time to maximum plasma concentration was 2.92 hours (range 2.5-3.5 hours). The mean time taken to reach the effective plasma concentration of 5 micrograms/mL and the mean time this concentration lasted were 0.87 hours and 6.73 hours, respectively. The duration for which the plasma tranexamic acid concentration remained greater than 5 micrograms/mL was 5.86 hours. The half-life was 1.65 hours. The area under the concentration-time curve was 49.16 micrograms.h/mL (range 43.75-52.69 micrograms.h/mL). CONCLUSION: Clinically effective plasma concentrations of tranexamic acid in postpartum women may be achieved within 1 hour of oral administration. Given the promising pharmacokinetic properties, we recommend additional studies with larger sample sizes to investigate the potential of oral tranexamic acid for the treatment or prophylaxis of postpartum hemorrhage.
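
    As an aside for readers reproducing such analyses, the non-compartmental quantities reported above (maximum concentration, its time, AUC by the trapezoidal rule, time above the 5 micrograms/mL threshold) can be computed from any sampled concentration-time profile. The sketch below is purely illustrative and is not the study's analysis code; all names are ours.

```python
import numpy as np

def pk_summary(times, conc, effective=5.0):
    """Non-compartmental summary of a concentration-time profile.
    times in hours, conc in micrograms/mL."""
    t = np.asarray(times, dtype=float)
    c = np.asarray(conc, dtype=float)
    cmax = c.max()                      # maximum observed concentration
    tmax = t[c.argmax()]                # time of the maximum
    auc = np.trapz(c, t)                # trapezoidal AUC, micrograms.h/mL
    above = (c >= effective).astype(float)
    time_above = np.trapz(above, t)     # crude time above threshold (whole sampling intervals)
    return cmax, tmax, auc, time_above
```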

    The Dreaming Variational Autoencoder for Reinforcement Learning Environments

    Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optimum. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms, as they provide a flexible, reproducible, and easy-to-control environment. Even so, few games feature a state-space in which results in exploration, memory, and planning are easily perceived. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE in partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration.
    Comment: Best Student Paper Award, Proceedings of the 38th SGAI International Conference on Artificial Intelligence, Cambridge, UK, 2018, Artificial Intelligence XXXV, 201
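
    The abstract does not detail the DVAE architecture, but the "dreaming" idea can be sketched as a conditional VAE that predicts the next state from the current state and action, so the agent can roll out imagined transitions. The following is an assumed minimal form, not the paper's exact model.

```python
import torch
import torch.nn as nn

class DreamModel(nn.Module):
    """Minimal VAE-style transition model in the spirit of DVAE (details assumed):
    encode a (state, action) pair to a latent z, then decode a predicted
    next state, letting the agent 'dream' rollouts without the environment."""

    def __init__(self, state_dim, action_dim, latent_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(state_dim + action_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, state_dim))

    def forward(self, state, action):
        h = self.enc(torch.cat([state, action], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation trick
        return self.dec(z), mu, logvar  # predicted next state + posterior statistics
```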

    A pragmatic look at deep imitation learning

    The introduction of the generative adversarial imitation learning (GAIL) algorithm has spurred the development of scalable imitation learning approaches using deep neural networks. The GAIL objective can be thought of as 1) matching the expert policy's state distribution; 2) penalising the learned policy's state distribution; and 3) maximising entropy. While theoretically motivated, GAIL can be difficult to apply in practice, not least due to the instabilities of adversarial training. In this paper, we take a pragmatic look at GAIL and related imitation learning algorithms. We implement and automatically tune a range of algorithms in a unified experimental setup, presenting a fair comparison between the competing methods. Based on our results, our primary recommendation is to consider non-adversarial methods. Furthermore, we discuss the common components of imitation learning objectives and present promising avenues for future research.
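
    For concreteness, the standard GAIL training signal pairs a binary discriminator loss with a surrogate reward for the policy. A minimal PyTorch sketch of that standard formulation (not the paper's tuned implementation; `disc` is a hypothetical network over state-action pairs) is below.

```python
import torch
import torch.nn.functional as F

def gail_discriminator_loss(disc, expert_sa, policy_sa):
    # Train the discriminator to label expert state-action pairs 1, policy pairs 0.
    expert_logits = disc(expert_sa)
    policy_logits = disc(policy_sa)
    return (F.binary_cross_entropy_with_logits(expert_logits, torch.ones_like(expert_logits))
            + F.binary_cross_entropy_with_logits(policy_logits, torch.zeros_like(policy_logits)))

def gail_reward(disc, policy_sa):
    # Surrogate reward -log(1 - D(s, a)): high where the discriminator
    # mistakes policy samples for expert ones.
    return -F.logsigmoid(-disc(policy_sa))
```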

    Deep unsupervised clustering with Gaussian mixture variational autoencoders

    We study a variant of the variational autoencoder model with a Gaussian mixture as a prior distribution, with the goal of performing unsupervised clustering through deep generative models. We observe that the standard variational approach in these models is unsuited for unsupervised clustering, and we mitigate this problem by leveraging a principled information-theoretic regularisation term known as consistency violation. Adding this term to the standard variational optimisation objective yields networks with both meaningful internal representations and well-defined clusters. We demonstrate the performance of this scheme on synthetic data, MNIST, and SVHN, showing that the obtained clusters are distinct and interpretable, and that they achieve higher unsupervised clustering accuracy than previous approaches.
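
    The ingredient such models share is a Gaussian-mixture prior over the latent space. As a sketch of how its log-density enters the variational objective, the function below evaluates log p(z) for a K-component diagonal mixture (parameter shapes are our assumption, not taken from the paper).

```python
import math
import torch

def gmm_log_prior(z, means, log_stds, logits):
    """log p(z) under a K-component diagonal Gaussian mixture.
    z: (batch, d); means, log_stds: (K, d); logits: (K,) mixture weights."""
    z = z.unsqueeze(1)                                       # (batch, 1, d) for broadcasting
    log_prob = -0.5 * (((z - means) / log_stds.exp()) ** 2
                       + 2 * log_stds
                       + math.log(2 * math.pi)).sum(dim=-1)  # (batch, K) per-component log N
    log_w = torch.log_softmax(logits, dim=0)                 # mixture log-weights
    return torch.logsumexp(log_w + log_prob, dim=1)          # (batch,) mixture log-density
```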

    Deep Reinforcement Learning for Join Order Enumeration

    Join order selection plays a significant role in query performance. However, modern query optimizers typically employ static join enumeration algorithms that do not receive any feedback about the quality of the resulting plan. Hence, optimizers often repeatedly choose the same bad plan, as they have no mechanism for "learning from their mistakes". In this paper, we argue that existing deep reinforcement learning techniques can be applied to address this challenge. These techniques, powered by artificial neural networks, can automatically improve decision making by incorporating feedback from their successes and failures. Towards this goal, we present ReJOIN, a proof-of-concept join enumerator, and present preliminary results indicating that ReJOIN can match or outperform the PostgreSQL optimizer in terms of plan quality and join enumeration efficiency.
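
    The abstract does not specify ReJOIN's agent (the paper builds on deep reinforcement learning with a neural policy), so the tabular sketch below only illustrates the feedback loop it describes: pick the next pair of relations to join, then update values from plan-quality feedback. All names here are hypothetical.

```python
import random
from collections import defaultdict

q = defaultdict(float)  # tabular stand-in for a learned value function

def choose_join(state, actions, eps=0.1):
    """Epsilon-greedy selection over candidate joins; `state` identifies the
    partial plan, `actions` are (left, right) relation pairs to join next."""
    if random.random() < eps:
        return random.choice(actions)                    # explore a new ordering
    return max(actions, key=lambda a: q[(state, a)])     # exploit the best known join

def q_update(state, action, reward, next_state, next_actions, alpha=0.1, gamma=1.0):
    # One-step Q-learning: the reward would come from the optimiser's
    # (negated) cost estimate once a full plan is enumerated.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
```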

    A brief survey of deep reinforcement learning

    Deep reinforcement learning (DRL) is poised to revolutionize the field of artificial intelligence (AI) and represents a step toward building autonomous systems with a higher-level understanding of the visual world. Currently, deep learning is enabling reinforcement learning (RL) to scale to problems that were previously intractable, such as learning to play video games directly from pixels. DRL algorithms are also applied to robotics, allowing control policies for robots to be learned directly from camera inputs in the real world. In this survey, we begin with an introduction to the general field of RL, then progress to the main streams of value-based and policy-based methods. Our survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor-critic (A3C). In parallel, we highlight the unique advantages of deep neural networks, focusing on visual understanding via RL. To conclude, we describe several current areas of research within the field.
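
    As a concrete anchor for the value-based stream surveyed above, the standard DQN temporal-difference loss can be written in a few lines. This is the textbook formulation, not code from the survey; `q_net` and `target_net` are assumed network modules.

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """Regress Q(s, a) toward r + gamma * max_a' Q_target(s', a').
    batch: (states, actions as LongTensor, rewards, next_states, done flags as floats)."""
    s, a, r, s_next, done = batch
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)    # Q(s, a) for the taken actions
    with torch.no_grad():
        q_next = target_net(s_next).max(dim=1).values        # bootstrap from the target network
        target = r + gamma * (1 - done) * q_next             # no bootstrap at episode ends
    return F.smooth_l1_loss(q_sa, target)
```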

    On the link between conscious function and general intelligence in humans and machines

    Get PDF
    In popular media, there is often a connection drawn between the advent of awareness in artificial agents and those same agents simultaneously achieving human- or superhuman-level intelligence. In this work, we explore the validity and potential application of this seemingly intuitive link between consciousness and intelligence. We do so by examining the cognitive abilities associated with three contemporary theories of conscious function: Global Workspace Theory (GWT), Information Generation Theory (IGT), and Attention Schema Theory (AST). We find that all three theories specifically relate conscious function to some aspect of domain-general intelligence in humans. With this insight, we turn to the field of Artificial Intelligence (AI) and find that, while still far from demonstrating general intelligence, many state-of-the-art deep learning methods have begun to incorporate key aspects of each of the three functional theories. Given this apparent trend, we use the motivating example of mental time travel in humans to propose ways in which insights from each of the three theories may be combined into a unified model. We believe that doing so can enable the development of artificial agents which are not only more generally intelligent but are also consistent with multiple current theories of conscious function.

    Memory-efficient episodic control reinforcement learning with dynamic online k-means

    Recently, neuro-inspired episodic control (EC) methods have been developed to overcome the data inefficiency of standard deep reinforcement learning approaches. Using non-/semi-parametric models to estimate the value function, they learn rapidly by retrieving cached values from similar past states. In realistic scenarios, with limited resources and noisy data, maintaining meaningful representations in memory is essential to speed up learning and avoid catastrophic forgetting. Unfortunately, EC methods have a large space and time complexity. We investigate different solutions to these problems based on prioritising and ranking stored states, as well as online clustering techniques. We also propose a new dynamic online k-means algorithm that is computationally efficient and yields significantly better performance at smaller memory sizes; we validate this approach on classic reinforcement learning environments and Atari games.
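
    For reference, the classic MacQueen-style online k-means update that such memory-clustering schemes build on is a one-liner per step; the `decay` factor below only gestures at a dynamic variant that forgets stale clusters, since the paper's exact rule is not given in the abstract.

```python
import numpy as np

def online_kmeans_update(centroids, counts, x, decay=1.0):
    """One sequential k-means step: move the nearest centroid toward x
    with step size 1/count. Setting decay < 1 crudely mimics a dynamic
    variant that down-weights old mass (assumed, not the paper's rule)."""
    counts *= decay                                          # optionally forget stale counts
    j = int(np.argmin(((centroids - x) ** 2).sum(axis=1)))   # nearest centroid index
    counts[j] += 1.0
    centroids[j] += (x - centroids[j]) / counts[j]           # running-mean update
    return j
```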